Integrating CD4 data into undiagnosed estimates

Martina Morris & Jeanette Birnbaum

Project Goal

Use CD4 at diagnosis to increase the precision of undiagnosed HIV estimates from the testing history model

Why

  • The testing history model relies heavily on the inter-test interval, and the distribution of infection probability across that interval.

    • For cases diagnosed on their first test, or with negative tests that are far in the past, this creates a long window of possible infection.
    • Assuming the distribution of infection probability is uniform across that interval decreases the precision of our estimate
    • These cases are not very informative regarding time of infection in our current model
  • High CD4 measured at diagnosis is an indicator of recent infection

    • So it can be used to identify cases with long windows who are likely to have been infected recently.

How

  • Identify typical times from infection associated with CD4 counts from the research literature;

  • Use this to reallocate the probability of infection within long infection windows.

  • NOTE: we only reallocate from a uniform distribution towards more recent infection when indicated by higher CD4. We do not reallocate towards less recent infection when indicated by low CD4. So the CD4 data can only move the undiagnosed estimate in one direction: downward.

Summary of Results

Impact in WA

  • Modest reduction in estimates of undiagnosed cases (about 5-6%)

  • Consistent with the findings from the previous analysis incorporating the BED for recent infection, but with more confidence because the level of missing data is much lower for CD4 (about 30% vs. 70% for the BED)

  • But we need to be careful now because our key remaining assumptions may be leading to downward bias in our estimates.

Outline of presentation

Introduction

  1. Review of the Testing History method for estimating undiagnosed cases of HIV

  2. Review of key testing history data descriptives in WA State

  3. How we integrate CD4 data into the method

Results

  1. CD4 data descriptives in WA State

  2. Impact on estimated median time since infection, and time spent undiagnosed

  3. Impact on estimated undiagnosed cases in 2014

Intro 1: Brief Review of the Testing History (TH) method

Base Case TH

Distributes probability of infection uniformly across the possible infection window

[Figure: Longer windows have less probability assigned to the recent period]

The probability density of infection at any point within the window is 1/(window length), shown by the red line for each window length. Time = 0 refers to time of diagnosis.

The cumulative probability of infection is the area under the curve. The 2-year window assigns greater probability of infection within the last year than the 5-year window, shown by the grey shaded region.
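As a quick check of the shaded areas, the Base Case probability of infection within the last year can be computed directly from the uniform assumption. This is a minimal sketch; the function name is illustrative and not part of the testing history software.

```python
def prob_infected_within(t_years, window_years):
    """Base Case: probability that infection occurred within the last
    t_years before diagnosis, given a uniform infection window."""
    if window_years <= 0:
        raise ValueError("window_years must be positive")
    # Area under a flat density of height 1/window over [0, t], capped at the window.
    covered = min(max(t_years, 0.0), window_years)
    return covered / window_years


print(prob_infected_within(1, 2))  # 2-year window: 0.5
print(prob_infected_within(1, 5))  # 5-year window: 0.2
```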

TH Never-tester assumption: A window of min(age - 16, 18) years

The age distribution of never-testers will influence the window lengths they contribute to the analysis.
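The never-tester rule is simple enough to sketch in a few lines. The function name is hypothetical, and flooring the window at zero for cases diagnosed before age 16 is our assumption for illustration, not something the method specifies here.

```python
def never_tester_window(age_at_dx):
    """Infection window (years) assigned to a never-tester:
    min(age - 16, 18), floored at 0 (the floor is an assumption)."""
    return min(max(age_at_dx - 16, 0), 18)


# Younger never-testers contribute shorter windows.
for age in (20, 25, 34, 50):
    print(age, never_tester_window(age))
```

Anyone diagnosed at age 34 or older gets the maximum 18-year window under this rule.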

[Figure: Effect of age on windows for never-testers]

Intro 2: Testing history data patterns in WA, 2005-2014

59% of cases have testing history

Testing histories = Tested + Never Tested

[Figure: Breakdown of testing histories in WA HIV cases 2005-2014]

Total N = 5148
With testing history N = 3016.

MSM are more likely to have tested before, and less likely to be missing

66% of MSM versus 43% of non-MSM have testing histories

[Figure: Testing history by group (percents are within group)]

  • MSM comprise about 2/3 of cases.

  • What is driving differential response rates?

Half of all windows are 2 years or less (red + gold)

[Figure: Distribution of infection window lengths in years (N = 3016)]

Labels indicate bounds, e.g. (0,1] includes windows >0 and <= 1. The 18-year windows are never-testers (NT).

  • These cases will not be modified by CD4 counts

About 1/3 of all windows are 5 years or longer (blue + purple)

[Figure: Distribution of infection window lengths in years (N = 3016)]

  • These are the cases with the greatest potential for CD4 modification, but only if their CD4 counts at diagnosis are high.

  • Less than half of these cases are never-testers (13/31 = 42%).

Median window lengths: about 1 year for MSM versus 5 years for non-MSM

[Figure: Distribution of infection window lengths by group (percents are within group)]

  • 58% of non-MSM have windows 5 years or longer, and about half of these are never testers.

More concurrent HIV/AIDS Dx among non-MSM

[Figure: Percent of cases with a concurrent HIV/AIDS diagnosis by group]

  • Consistent with the longer window lengths for non-MSM

MSM Never-testers are younger

[Figure: Age distribution of never-testers by group]

  • Mean age at diagnosis among never-testers is 36 in MSM vs. 42 in non-MSM

  • This will also lead to shorter windows for MSM.

Summary: non-MSM have greater potential for modified estimates

Whether it's using BED, CD4, or concurrent AIDS Dx, etc

  • Longer windows of possible infection for non-MSM = more uncertainty regarding time of infection, more opportunity to reallocate when these measures indicate recent infection, and more opportunity to improve the precision of our estimates.

  • Short testing intervals in MSM = low uncertainty regarding time of infection, less potential impact of additional information

Increasing precision will not necessarily decrease undiagnosed estimates

  • Modified estimates will deviate from the Base Case only to the extent that these added biomarkers indicate recent infection

  • An example where this could happen is if cases with long windows tested due to risky exposure: CD4 should help pick up on that, and this would reduce the undiagnosed estimates.

Overall estimates may not change much, but our confidence in them will increase

Intro 3: Integrating CD4 into the testing history method

CD4 levels by time since infection

We expect an individual's CD4 trajectory after HIV infection to look something like this:

With CD4 steadily falling over time in the absence of treatment.

CD4 levels by time since infection

The real picture is more like this:

CD4 progressions are highly variable across individuals.

  • There is still a broad trend that can be summarized
  • But it is summarized in coarse categories, rather than continuous measures

Estimated CD4 levels by time since infection (from the research literature)

The standard measure is the number of years it takes for 50% of untreated cases to reach a CD4 level threshold:

CD4 Threshold Lodi 2011 Cori 2015 We use*
500+ 1.3 2.3 1.5
350 4.3 4.2 4.0
200 7.9 8.0 8.0
< 200 11.5 9.0**

* Nearest rounded median times that are consistent with both sources.
** For CD4 < 200, we retain our maximum-window assumption of 18 years, which implies a median time of 9 years.

Interpretation: 50% of infection probability should occur within the median time.

Implication: If a case has CD4 > 500 at diagnosis, for example, and its infection window is longer than 3 years, we allocate 50% of the infection probability to the 1.5 years prior to Dx, and the remaining 50% to the rest of the window.

Lodi S, Phillips A, Touloumi G, Geskus R, Meyer L, Thiébaut R, et al. Time from human immunodeficiency virus seroconversion to reaching CD4+ cell count thresholds <200, <350, and <500 Cells/mm3: assessment of need following changes in treatment guidelines. Clin Infect Dis Off Publ Infect Dis Soc Am. 2011 Oct;53(8):817–25

Cori A, Pickles M, van Sighem A, Gras L, Bezemer D, Reiss P, et al. CD4+ cell dynamics in untreated HIV-1 infection: overall rates, and effects of age, viral load, sex and calendar time. AIDS Lond Engl. 2015 Nov 28;29(18):2435–46.

How we incorporate CD4 into the TH method

Example: re-allocating infection probability for an 18 year window

  • Base Case: Uniform distribution, so 50% is in each half of the window, (0, 9] and (9, 18]

[Figure: Allocation of the probability of infection]

  • CD4 Case: 50% of infection probability is reallocated to the CD4-based median window

[Figure: Allocation of the probability of infection. Base Case (red) versus CD4 Cases (blue) for the three new CD4 threshold values. Shading indicates the new placement of the initial 50% of infection probability.]

Note that for cases with CD4 < 200, there is no change from the Base Case.
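The reallocation can be written as a cumulative distribution over time before diagnosis. The sketch below is illustrative only: the function name is hypothetical, and we assume the probability is spread uniformly within each of the two pieces (inside and beyond the CD4-based median), which is one simple way to place 50% in the recent period.

```python
def cd4_case_cdf(t, window, cd4_median=None):
    """Probability that infection occurred within t years before Dx.

    window     : length of the infection window (years)
    cd4_median : CD4-based median time (years), or None for the Base Case
    Assumes a piecewise-uniform density (an assumption of this sketch).
    """
    t = min(max(t, 0.0), window)
    # No modification unless the window is more than twice the CD4-based median.
    if cd4_median is None or window <= 2 * cd4_median:
        return t / window
    if t <= cd4_median:
        # First 50% of probability goes to the recent period [0, cd4_median].
        return 0.5 * t / cd4_median
    # Remaining 50% is spread over the rest of the window.
    return 0.5 + 0.5 * (t - cd4_median) / (window - cd4_median)


# 18-year window: probability of infection within the last 1.5 years
print(cd4_case_cdf(1.5, 18))         # Base Case, 1.5/18
print(cd4_case_cdf(1.5, 18, 1.5))    # CD4 > 500 case, 0.5
print(cd4_case_cdf(9.0, 18, 9.0))    # CD4 < 200: same as Base Case, 0.5
```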

Impact will come from people with "long" windows and high CD4s

CD4 Category    CD4 Median (years)    Impacts windows longer than (years)
>500            1.5                   3.0
350-500         4.0                   8.0
200-350         8.0                   16.0

Actual impact will depend on how much longer windows are than 2x the CD4-based median

  • 18-year windows among individuals with CD4>500 will provide much greater impact than 4-year windows, for example
  • Even 18-year windows will have minimal impact among CD4 200-350, since the Base Case median for 18-year windows is 9 years and the CD4-based median is 8 years (not much difference)
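To make the size of the impact concrete, the shift in the median time since infection is the Base Case median (half the window) minus the CD4-based median, whenever the window exceeds twice that median. A small sketch, with a hypothetical function name:

```python
def median_shift(window, cd4_median):
    """Reduction (years) in median time since infection when moving
    from the Base Case to the CD4 Case; 0 if the case is not modified."""
    if window <= 2 * cd4_median:
        return 0.0
    return window / 2 - cd4_median


cd4_medians = {">500": 1.5, "350-500": 4.0, "200-350": 8.0}
for category, m in cd4_medians.items():
    print(category,
          "threshold:", 2 * m,
          "shift for 18-year window:", median_shift(18, m),
          "shift for 4-year window:", median_shift(4, m))
```

For CD4 > 500 the median moves 7.5 years earlier for an 18-year window but only 0.5 years for a 4-year window; for CD4 200-350 an 18-year window moves by just 1 year.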

Results 1: CD4 data patterns in WA

28% of cases with testing history do not have useable CD4 data

Why: We exclude cases when their first CD4 count is not within 30 days of HIV Dx, or if it's missing
Treatment is likely to begin soon after diagnosis, altering CD4 counts

[Figure: CD4 counts among excluded cases by length of delay or missingness]

  • Note the longer the delay, the higher the CD4 count, which is consistent with initiation of treatment.

  • It may be possible to refine this exclusion by using info on treatment initiation & viral load.

Long windows are weakly correlated with low CD4

For those with a valid first CD4 measure (N=2178)

[Figure: CD4 count vs infection window length, by testing status (columns) and group (rows)]

  • Recall that Never Testers get a window length of min(age-16, 18) years.

  • Note how much variability there is, even for the short-window cases!

    • Confirms CD4 is a very noisy (i.e., weakly informative) measure of recency

Low CD4 is more common in never-testers

[Figure: CD4 distribution by testing history (colors) and group (panels)]

  • Shaded area indicates CD4 < 500 in never-testers.

  • Suggests delayed testing, maybe in response to HIV illness-related symptoms?

  • Implies the CD4 data will not modify the estimates for most of these long-window cases.

However, some never-testers do have high CD4

[Figure: CD4 distribution by testing history (colors) and group (panels)]

  • Shaded area indicates CD4 > 500 in never-testers

  • This is where the CD4 case has the greatest potential impact (if the never-testers are older)

Impact: 10% of cases are modified by the CD4 data

Cases are modified if their observed window is more than twice as long as the CD4-based median window
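The modification rule can be sketched as a per-case check. The names below are hypothetical, and the handling of counts that fall exactly on a bin boundary (200, 350, 500) is an assumption made for illustration.

```python
# CD4-based median times (years); 9.0 for CD4 < 200 reflects the 18-year maximum window.
CD4_MEDIAN_YEARS = {">500": 1.5, "350-500": 4.0, "200-350": 8.0, "<200": 9.0}


def cd4_bin(cd4_count):
    """Assign a CD4 count to a category (boundary handling is an assumption)."""
    if cd4_count > 500:
        return ">500"
    if cd4_count > 350:
        return "350-500"
    if cd4_count >= 200:
        return "200-350"
    return "<200"


def is_modified(window_years, cd4_count):
    """True if the CD4 Case would change this case's infection distribution."""
    return window_years > 2 * CD4_MEDIAN_YEARS[cd4_bin(cd4_count)]


print(is_modified(18, 600))  # long window, high CD4: True
print(is_modified(2, 600))   # short window: False
print(is_modified(18, 150))  # low CD4: False
```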

[Figure: Cross-tabulation of window length by CD4 bin. Green = modified by the CD4 Case]

  • Percents are of the total N with testing histories (N = 3016). Dot sizes are proportional to percent.

CD4 case modifies 7% of MSM versus 17% of non-MSM

[Figure: Cross-tabulation of window length by CD4 bin, by group]

The group with the smallest impact is CD4 = 200-350 with window > 16 years. This group is

  • more common among non-MSM: 5% of cases, or 5/17 = 29% of modified cases

  • than among MSM: 1% of cases, or 1/7 = 14% of modified cases

Results 2: CD4 impact on estimated time from infection to diagnosis

Estimated median time since infection declines slightly

Median time since infection = time by which 50% of infection probability has occurred

[Figure: Estimated median time since infection under the Base Case (blue) and CD4 Case (orange) by CD4 bin]

  • Among all cases with testing history (N = 3016)

  • Declines are greater among the groups with higher CD4 levels.

Absolute decreases in the median time since infection are smaller for MSM




This is what we expected, that the CD4 information would have less impact on MSM.

[Figure: Average median time since infection under the Base Case (blue) and CD4 Case (orange), by group]




This result is a simple product of the proportion of cases modified, and the average change, for each group

Contributions to the absolute difference in median time since infection by group.

Population    Proportion Impacted    Average Change (years)    Abs. Difference (years)
MSM           0.07                   -3.10                     -0.22
non-MSM       0.17                   -3.01                     -0.51

Among all cases with testing history (N = 3016)
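The decomposition can be checked with the rounded values from the table; the variable names are illustrative.

```python
# (proportion of cases modified, average change in median time among modified cases)
groups = {"MSM": (0.07, -3.10), "non-MSM": (0.17, -3.01)}

# Absolute difference = proportion modified x average change among those modified
abs_diff = {name: p * change for name, (p, change) in groups.items()}

for name, diff in abs_diff.items():
    print(f"{name}: {diff:.2f} years")
```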

But the relative decline is the same for both groups: about 11%

This was a bit surprising

Why are they the same?

  • Because the relative decline also depends on the starting values (Base Case estimate)

Contributions to the relative difference in median time since infection by group.

Population    Base Case (years)    Abs. Difference (years)    Rel. Difference (%)
MSM           1.94                 -0.22                      -11.3
non-MSM       4.50                 -0.51                      -11.3
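Dividing each absolute difference by its Base Case value shows why the relative declines coincide. A minimal sketch using the rounded table values:

```python
# (Base Case median time since infection, absolute difference), in years
groups = {"MSM": (1.94, -0.22), "non-MSM": (4.50, -0.51)}

# Relative difference (%) = 100 x absolute difference / Base Case value
rel_diff = {name: 100 * diff / base for name, (base, diff) in groups.items()}

for name, pct in rel_diff.items():
    print(f"{name}: {pct:.1f}%")
```

The non-MSM absolute difference is more than twice as large, but so is the non-MSM Base Case value, so the ratios are nearly identical.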

Average time spent undiagnosed decreases by about 6% in both groups

Average time undiagnosed = the mean of the time from infection to diagnosis (TID) curve

Mean TID for the Base Case and CD4 Case, by MSM status

Population    Base Case (years)    CD4 Case (years)    Difference (years)    Percent Change
MSM           1.83                 1.72                -0.11                 -6.13
non-MSM       4.38                 4.13                -0.25                 -5.74



  • Again you can see the absolute decline is greater for non-MSM, but the relative decline is about the same (though slightly greater for MSM in this case)

Results 3: CD4 impact on undiagnosed estimates for 2014

Undiagnosed estimates decrease 5-6% in 2014

Undiagnosed cases

Subgroups parallel the decreases in mean undiagnosed time (6.1% for MSM and 5.7% for non-MSM)

Population    Base Case    CD4 Case    Difference    Percent Change
Total         1319.0       1247.0      -72.0         -5.5
MSM           604.7        568.4       -36.3         -6.0
non-MSM       714.3        678.2       -36.1         -5.1

Undiagnosed fractions

Here the MSM declines are relatively larger: -6.5 vs. -4.0 for non-MSM

Population    Base Case (%)    CD4 Case (%)    Absolute Difference    Percent Change
Total         9.4              8.9             -0.5                   -5.3
MSM           6.2              5.8             -0.4                   -6.5
non-MSM       17.1             16.4            -0.7                   -4.0
  • The fractions take into account diagnosed PLWH.
    • Since the Base Case undiagnosed fraction is much higher for non-MSM, it is less sensitive than the MSM fraction to the CD4 Case's decrease of about 36 undiagnosed cases.
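The sensitivity argument can be illustrated by backing out the number of diagnosed PLWH implied by the tables above and recomputing the fractions. This is a rough sketch: the function names are hypothetical, and the implied diagnosed counts are approximations derived from rounded table values, not reported surveillance figures.

```python
def undiagnosed_fraction(undiagnosed, diagnosed):
    """Undiagnosed fraction (%) = undiagnosed / (undiagnosed + diagnosed PLWH)."""
    return 100.0 * undiagnosed / (undiagnosed + diagnosed)


def implied_diagnosed(undiagnosed, fraction_pct):
    """Approximate diagnosed PLWH implied by a count and a rounded fraction (%)."""
    return undiagnosed * (100.0 - fraction_pct) / fraction_pct


# (Base Case undiagnosed count, Base Case fraction %, CD4 Case undiagnosed count)
rows = {"MSM": (604.7, 6.2, 568.4), "non-MSM": (714.3, 17.1, 678.2)}

results = {}
for group, (base_n, base_pct, cd4_n) in rows.items():
    diagnosed = implied_diagnosed(base_n, base_pct)   # approximation
    cd4_pct = undiagnosed_fraction(cd4_n, diagnosed)
    rel_change = 100.0 * (cd4_pct - base_pct) / base_pct
    results[group] = (cd4_pct, rel_change)
    print(f"{group}: CD4 Case fraction {cd4_pct:.1f}%, relative change {rel_change:.1f}%")
```

Because undiagnosed cases also appear in the denominator, the same drop in cases produces a smaller relative drop in the fraction when the fraction is large. The recomputed relative changes differ somewhat from the table because they start from rounded inputs.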

Conclusions, limitations and possibilities

Key findings for WA State

In WA, most cases with long windows were not recently infected

Only 10% of cases with testing history had infection windows that indicated less probability of recent infection than indicated by their CD4 count

In WA, CD4 data incorporation had about the same relative impact on both MSM and non-MSM estimates

We expected to see greater impact in non-MSM than MSM

This was due to offsetting differences in the three components that influenced the estimate:

  • Percent modified: MSM had a lower fraction of cases modified

    • 7% for MSM vs 17% for non-MSM

  • Size of modification: MSM had larger decreases in their median time since infection

    • -2.3 yrs for MSM vs -1.9 yrs for non-MSM

  • Starting values from Base Case estimates: MSM had lower Base Case estimates

    • 6.2% undiagnosed for MSM vs 17.1% for non-MSM

This translated into slightly higher impacts on mean TID and mean undiagnosed estimates for MSM

Limitations

We prioritize testing history data

  • If testing history indicates more recent infection than CD4 does, we use the testing history
  • What fraction of our last negative test (LNT) data is unverified self-report?
  • Little is known about the accuracy of self-reports of the last negative test before diagnosis

We use CD4 conservatively

  • Literature on CD4 trajectories is fairly sparse
  • Data indicate plenty of heterogeneity, hence the need for using a fairly conservative approach for the CD4 Case

Our estimates now may be downwardly biased

  • We are systematically including data that reduce undiagnosed estimates (e.g., CD4 or BED)
  • But we are ignoring data that may increase estimates (e.g., concurrent Dx)

So we do not recommend using the CD4-based estimates for publication

Future work

Recall that we are still using the “missing at random” assumption for cases without a testing history

  • But they have a higher prevalence of concurrent HIV/AIDS diagnoses
  • And a lower prevalence of BED-based recent infection

    • This is one major factor that may be contributing to a downward (i.e., too optimistic) bias in our undiagnosed estimates now

  • We could also use CD4 data in cases with missing testing history

Indicators of recent infection by testing history status

Indicator                     Non-Missing    Missing
Percent BED +                 23             8
Mean CD4                      395            328
Median CD4                    372            280
Percent with Concurrent Dx    30             42

  • Among the 3589 cases (70% of all cases) with CD4 measured within 30 days

Longer term: We want to integrate multiple markers

  • Use a Bayesian approach to combine information from BED, CD4 and concurrent diagnoses

Appendix: Additional details

Among impacted cases, MSM actually have slightly greater decreases in median time since infection

[Figure: Among impacted cases (N = 296), density of differences in the median time since infection comparing the Base Case to the CD4 Case. Lines indicate the means of the distributions.]

Mean decrease in median time since infection is slightly larger for MSM (3.10 years) than for non-MSM (3.01 years).

Among impacted cases, MSM actually have slightly greater decreases in median time since infection

[Figure: Among impacted cases (N = 296), density of differences in the median time since infection comparing the Base Case to the CD4 Case. Lines indicate the medians of the distributions.]

Median decrease in median time since infection is slightly larger for MSM (2.34 years) than for non-MSM (1.94 years).

Overall impact on TID is subtle

MSM

[Figure: Time from infection to diagnosis: probability curve (top) and undiagnosed fraction curve (bottom)]

Overall impact on TID is subtle

non-MSM

[Figure: Time from infection to diagnosis: probability curve (top) and undiagnosed fraction curve (bottom)]

Incidence and undiagnosed results over time